This document is a guide to WildPackets' language for decoding packets in the Peek products. The language described here enables our customers to extend the decoders we have provided or to create new ones. We have devised a simple, interpreted language designed specifically for displaying network frames for several media types. The language includes instructions for displaying data, controlling the sequence of instructions, manipulating data values, and conditional execution. The format of decoder files, a detailed description of each instruction and its effects, and the tools needed for creating decoder files are discussed.
The decode of a protocol is accomplished through a collection of decoders. Each decoder is uniquely named and contains a sequence of decode instructions. An important part of the decoding process is the ability of decoders to link to other decoders. This allows a modular design, and eliminates the need for duplication of decoders for the decoding of different packets with the same first layers. Decoders of one file may use decoders of others.
The decoding process follows the layers of the protocols from the bottom layer to the top, and as layers are encountered, decoders call other decoders and relinquish the decoding task to them. There is a strong presumption that the bottom layers appear first in the packet, although there is some ability to handle decoding when this is not exactly the case.
Each decoder is a combination of one or more decoding instructions. Most instructions are composed of a decode instruction, a value, some option fields, and a string. When the user decides to decode a packet, the decode manager is called with a pointer to the first byte of the packet. As decode instructions are encountered and executed, this pointer is incremented as dictated by the instruction type, in preparation for the next instruction. Packet data that is processed by the instruction may be saved for later use in global or local variables.
Stated another way, each decoder instruction operates on one or more bytes at the pointer's current position in the packet, and the pointer moves along correspondingly. In some cases, an instruction may operate only on bits within a byte.
When the decode manager runs out of instructions, it will display the rest of the packet in a generic data format. Depending on the media type, the last of the remaining bytes are reserved for the FCS (4 bytes, Ethernet or Wireless), or not (0 bytes, Token Ring). If the decode manager runs out of packet data before exhausting the instructions, an error message is displayed, notifying the user that not enough bytes were available. Nevertheless, with certain patterns of instructions it is still possible in some cases to scan beyond the end of the packet or to create infinite loops, so some caution should be exercised.
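The pointer behaviour described above can be pictured with a small Python model. This is only an illustrative sketch, not the product's implementation: the class name `PacketCursor`, its methods, the `fcs_len` parameter, and the big-endian byte order used when storing a value are all assumptions made for the example.

```python
class PacketCursor:
    """Hypothetical model of the decode pointer moving through a packet."""

    def __init__(self, packet: bytes, fcs_len: int = 4):
        self.data = packet
        self.pos = 0
        # Bytes before this offset may be decoded; the rest is reserved for the FCS.
        self.end = max(len(packet) - fcs_len, 0)

    def take(self, count: int) -> bytes:
        """Return the next `count` bytes and advance the pointer past them."""
        if self.pos + count > self.end:
            raise ValueError("not enough bytes available")
        chunk = self.data[self.pos:self.pos + count]
        self.pos += count
        return chunk

    def remaining(self) -> bytes:
        """Bytes left over for the generic data display once instructions run out."""
        return self.data[self.pos:self.end]


# 8 bytes of data followed by a 4-byte FCS (Ethernet-style).
cursor = PacketCursor(bytes.fromhex("08004500aabbccdd11223344"))
globals_ = [0] * 32
# Read a 16-bit word and save it in global 2 (byte order is an assumption).
globals_[2] = int.from_bytes(cursor.take(2), "big")
```

After the two bytes are consumed, `cursor.remaining()` holds the six bytes that would be shown generically, and asking for more bytes than are left raises the "not enough bytes" error.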
A decoder is a sequence of one or more instructions. A typical instruction has a type, a value, an option, a data style, a label style, and a string (that may be empty).
Field | Description |
---|---|
Type | A four character code identifying the decoder instruction. |
Value | A signed 32-bit value. The two high bits are used to provide options for certain instructions. |
Global | A signed 32-bit value. The 26 high bits are used to provide options for certain instructions. The low 5 bits represent the index of a global variable. |
Data Style | A 1-byte value containing fields for text location and style. |
Label Style | A 1-byte value containing fields for text location, style, and a new line option. |
String | An ASCII string to be used as a label or the name of another decoder or a string array. |
Fundamentally, a single decoder instruction is intended to provide enough information to yield a label and a value after that label. The decode instruction also includes formatting information about both the label and data parts. Of course, there is no requirement to do both a label and data portion in each instruction and there are some decoder instructions that just operate on the data without displaying either a label or data, or instructions which just control the flow of instructions: test and branch, subroutine call, etc.
The decoders have access to 32 long word (32-bit) global variables, where they may save information for later use. There is also one implicit Boolean global, for use in test instructions. These globals make it possible to interpret data in later parts of a packet according to values in earlier bytes of that packet.
Bit(s) | Meaning |
---|---|
0x00000040 | Local bit on global |
0x00000080 | Register bit on value |
0x00000100 | Previous Test bit (TEQU family only) |
0x00080000 | Constant value bit |
0x0FF00000 | Number of bits to skip |
0x20000000 | Skip bit |
0x40000000 | Branch bit |
0x80000000 | Debugger break |
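As a rough aid to reading global field values, the option bits in the table above can be separated from the variable index with a few masks. The Python below is a sketch under stated assumptions: the function and constant names are hypothetical, and treating the 0x0FF00000 field as a count shifted down by 20 bits is an interpretation of the table rather than something the text spells out.

```python
# Option bits of the global field, taken from the table above.
OPTION_BITS = {
    0x00000040: "local",
    0x00000080: "register value",
    0x00000100: "previous test",
    0x00080000: "constant value",
    0x20000000: "skip",
    0x40000000: "branch",
    0x80000000: "debugger break",
}
INDEX_MASK = 0x0000001F       # low 5 bits: variable index 0..31
SKIP_BITS_MASK = 0x0FF00000   # number of bits to skip
SKIP_BITS_SHIFT = 20          # assumed position of that count


def split_global_field(field: int) -> dict:
    """Break a global field into its index, bit-skip count, and option names."""
    return {
        "index": field & INDEX_MASK,
        "bits_to_skip": (field & SKIP_BITS_MASK) >> SKIP_BITS_SHIFT,
        "options": [name for bit, name in OPTION_BITS.items() if field & bit],
    }


example = split_global_field(0x42)  # index 2 with the local bit set
```

For instance, a global field of 42 (hex) splits into index 2 with the local bit set, and 20000001 splits into index 1 with the skip bit set.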
The decoders also have access to 32 long word (32-bit) local variables, where they may save information for later use inside a particular decode. These local variables provide the ability to use data from earlier parts of a decode. To use a local, set the second most significant bit rather than the Most Significant Bit (MSB). See the example below, which adds to local variable 2.

**Note: You can only move global variables to local variables.

    ADDG 3 42; * add 3 to local 2
    ADDG 3 82; * add global 3 to local 2
    ADDG 40000003 c2; * add local 3 to local 2

As indicated by any given instruction, the global field determines which local to use in the operation. For example, after reading two bytes of the packet, you may wish to store them in local number 3 for later use, in the manner of the example above.
Several decoder instructions perform tests to be used in other conditional instructions. For example, one instruction may compare packet data to a value or a value to a global and then a subsequent instruction will conditionally display a string or branch to another decoder. The decoder mechanism keeps track of the status of tests. See the table of instruction descriptions for instructions that take advantage of this feature. Here is an example:
    void Test;
    MOVE 2 1; * move 2 into g1
    SEQU 2 1; * is 2 == g1 ?
    SKIP 1;
    LABL -They are not equal;
    TRTS;
Below is a variation of the example using ENDS instead of specifying the actual number to skip:
    void Test;
    MOVE 2 1; * move 2 into g1
    SEQU 2 1; * is 2 == g1 ?
    SKIP;
    LABL -They are not equal;
    ENDS;
    TRTS;
And the coup de grâce is the following example, which sets the "skip bit" on the SEQU. The skip bit instructs the SEQU to skip the instructions up to the ENDS instruction when the test fails. In other words, if the test is true the enclosed block is executed.

    void Test;
    MOVE 2 1; * move 2 into g1
    SEQU 2 20000001; * if 2 == g1 ?
    LABL -They are equal;
    ENDS;
    TRTS;

In the above example the skip bit is the third most significant bit of the global field (0x20000000). The skip bit is supported by all of the S* instructions, including SEQU, SNEQ, SLTE, SGTE, and SBIT.
The other major component of decoders is the string array, referred to here as a "str#" resource (because it is identified by the string "str#"). String arrays are identified by a unique name and are composed of a list of strings located with a 1-based index. Indexed string arrays provide their own index. Decoders can make use of string arrays by extracting data from a packet, possibly doing some arithmetic, and either displaying a string from an array or calling a function specified in an array.
String arrays are identified by the lowercase string "str#" followed by a unique name. Each string in the array appears on a line by itself terminated by a semicolon. Here is an example:
    str# MyStrings;
    String1;
    String2;
    String3;
    ;
    String5;
    String6;
The above string array is accessed in a linear manner. The line with only a semicolon specifies an empty string entry. A variation of the string array is the indexed string array. In this type of str#, each entry specifies its own index as a hex value, followed by a '|', followed by the string. Here is an example:
    str# MyStrings2;
    1 | String1;
    5 | String5;
    8 | String8;
    9 | String9;
Another variation of the string array is the mixed string array, which includes indexed and non-indexed entries. Here is an example:

    str# MyBitStr;
    3 | 1... Bit 3 is set;
    0... Bit 3 is not set;
    2 | .1.. Bit 2 is set;
    .0.. Bit 2 is not set;
    1 | ..1. Bit 1 is set;
    ..0. Bit 1 is not set;
The string array in the example above is used with the BST# instruction.
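To make the linear and indexed forms concrete, the following Python sketch turns the text of a string array into a name and an index-to-string table. It is an illustration only: the function name `parse_str_array` is hypothetical, strings containing ';' or '|' are not handled, and the positional numbering it gives to non-indexed entries is not meant to represent how BST# treats a mixed array.

```python
def parse_str_array(text: str) -> tuple[str, dict[int, str]]:
    """Parse 'str# Name; entry; entry; ...' into (name, {index: string})."""
    entries = [entry.strip() for entry in text.split(";")]
    if entries and entries[-1] == "":
        entries.pop()  # drop the empty piece after the final semicolon
    if not entries or not entries[0].startswith("str#"):
        raise ValueError("a string array starts with 'str#' and a name")
    name = entries[0][len("str#"):].strip()
    table = {}
    for position, entry in enumerate(entries[1:], start=1):
        if "|" in entry:
            # Indexed entry: hex index, '|', then the string.
            index_text, _, value = entry.partition("|")
            table[int(index_text.strip(), 16)] = value.strip()
        else:
            # Linear entry: 1-based position; a bare ';' gives an empty string.
            table[position] = entry
    return name, table


linear_name, linear = parse_str_array(
    "str# MyStrings; String1; String2; String3; ; String5; String6;"
)
indexed_name, indexed = parse_str_array(
    "str# MyStrings2; 1 | String1; 5 | String5; 8 | String8; 9 | String9;"
)
```

In the linear case entry 4 is the empty string and entry 5 is "String5"; in the indexed case only indexes 1, 5, 8, and 9 exist.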
Decoders are stored in text files located in a directory adjacent to the
program. Under MacOS, this directory is called "Packet Decoders";
under Windows, simply "Decodes". Individual decoders are identified by
the lowercase string "void" followed by a unique name. Individual
instructions follow the format: [instruction] [value] [global] [data format]
[label format] [string], where each element is separated by a tab character or
one or more spaces. Each line is terminated by a semicolon. Here is an example:
    void FTP;
    TEQU 0 10 0 30 FTP No Cmd Data;
    TLTE 5dc 10 0 30 FTP No Cmd Data;
    LABL 0 0 0 b1 FTP Control - File Transfer Protocol;
    TNXT 0 0 0 0 FTP Cmd or Reply; **
The same function could also be written as:
    void FTP()
    {
        TEQU( 0, g[0x10], 0, 0x30, "FTP No Cmd Data");
        TLTE( 0x5dc, g[0x10], 0, 0x30, "FTP No Cmd Data");
        LABL( 0, 0, 0, 0xb1, "FTP Control - File Transfer Protocol");
        TNXT( 0, 0, 0, 0, "FTP Cmd or Reply");
    }
**Notes: A dash can be used in place of the four number parameters when they are all equal to zero, i.e.,
TNXT -FTP Cmd or Reply;
can replace:
TNXT 0 0 0 0 FTP Cmd or Reply;
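The equivalence of the dash form and the four-zero form can be checked with a small Python sketch that splits an instruction line into its fields. The function name `parse_instruction` and the dictionary keys are hypothetical, the numeric fields are read as hexadecimal as in the FTP example, and only the full six-field form and the dash shorthand are handled (no trailing `*` comments, no instructions with fewer fields).

```python
def parse_instruction(line: str) -> dict:
    """Split one instruction line into type, four numeric fields, and string."""
    text = line.strip()
    if text.endswith(";"):
        text = text[:-1]
    pieces = text.split(None, 1)  # fields are separated by tabs or spaces
    if not pieces:
        raise ValueError("empty instruction")
    opcode = pieces[0]
    rest = pieces[1] if len(pieces) > 1 else ""
    if rest.startswith("-"):
        # Dash shorthand: all four numeric fields are zero.
        numbers, string = [0, 0, 0, 0], rest[1:]
    else:
        fields = rest.split(None, 4)
        if len(fields) < 4:
            raise ValueError("expected four numeric fields or a dash")
        numbers = [int(field, 16) for field in fields[:4]]
        string = fields[4] if len(fields) > 4 else ""
    value, global_field, data_format, label_format = numbers
    return {
        "type": opcode,
        "value": value,
        "global": global_field,
        "data_format": data_format,
        "label_format": label_format,
        "string": string,
    }


short_form = parse_instruction("TNXT -FTP Cmd or Reply;")
long_form = parse_instruction("TNXT 0 0 0 0 FTP Cmd or Reply;")
```

Both calls produce the same fields, which is the point of the shorthand.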
The TSUB instruction has become implicit for subroutine calls. It is not necessary to prefix a subroutine call with TSUB anymore. For example, what used to be:
TSUB 0 0 0 0 Foo;
or:
TSUB -Foo;
can also be simply:
Foo;
Specifying TSUB will still work but is not necessary.
Assignments and math on locals and globals can be done using the following syntax:
    g[1] = 1;
    l[1] = 2;
    g[4] = l[4];
    l[3] = g[6];
    g[1] += 1;
    l[1] += 2;
    g[4] += l[4];
    g[7] += g[6];
    l[3] += g[6];
    l[3] += l[6];
    g[1] *= 0;
    l[2] /= 2;
    g[3] %= 1;
    g[4] &= l[4];
    g[5] |= 5;
Tests on locals and globals can be done using the following syntax:
    if g[2] == 1 {
        LABL 0 0 90 c2 "g[2] == 1";
    }
    if g[2] == l[2] {
        LABL 0 0 90 c2 "g[2] == l[2]";
    }
    if l[1] == g[1] {
        LABL 0 0 90 c2 "l[1] == g[1]";
    }
Tests can also be nested:
    if g[1] <= l[2] {
        LABL 0 0 90 c2 "g[1] <= l[2]";
        if g[2] <= l[1] {
            LABL 0 0 90 c2 "g[2] <= l[1]";
        }
    }
Packet data can be accessed in the following ways:
    g[1] = pb[1]; * means GBYT 1 1;
    g[1] = pb[0]; * means GBYT 0 1;
    g[1] = pw[0]; * means GWRD 0 1;
    g[1] = pl[0]; * means GLNG 0 1;
    l[1] = pl[0]; * means GLNG 0 41;
In the above examples, 'p' stands for packet, 'b' for byte, 'w' for word, and
'l' for long.
The following is an example of while loops:
    void main()
    {
        g[1] = 0;
        while (g[1] < 5) {
            g[1] += 1;
            DGLB( 0, g[1], 0x90, 0xc2, "g[1]:");
        }
        g[1] = 5;
        while (g[1] > 0) {
            g[1] -= 1;
            DGLB( 0, g[1], 0x90, 0xc2, "g[1]:");
        }
    }
Packets usually contain the information needed to decode them into their protocol components. However, some protocol stacks use transactions in the form of a request packet followed by a response packet, and the response packet does not contain any information about the type of request to which it is responding. For example, in AFP (the AppleTalk Filing Protocol), a request will be made to get information about a file, and the response will contain the requested information, but the response packet does not identify itself as being the answer to a file information request. Because of the lack of information in the response packet, it could just as easily be interpreted as a response to a get volume parameters request.

As a solution to this problem, the decoder language contains the WHOA instruction. This instruction takes the name of a string array as its string parameter. Each name in the string array is a choice of how to decode the next portion of the packet, and this name is also the name of the decoder to be used. These choices are presented to the user in a dialog, and the value field contains the index of the default choice that is highlighted in the list.
With this as the basis, it should be clear that the user must manually select a decode choice for those protocols that do not identify their responses. As an implementation detail, the program tends to remember the last choice selected and uses that one again for subsequent packets. However, such inference is often incorrect.
To extend the capabilities of the decoders, an additional mechanism was added that is referred to here as threaded decoding. First, the MARK instruction is added so that when a request packet is encountered, the packet is "marked" with the relevant information; in particular, the type of response that will be subsequently expected. Then, when the WHOA instruction is encountered at some later packet, the WHOA instruction checks to see if there is a marked packet that tells it how to decode the current packet. If so, the user is NOT prompted for input, but instead it is assumed that the decoding should be done as the previous MARK instruction indicated.
To clarify and illustrate the mechanism, consider the following example. A certain protocol sets up a socket-to-socket connection. Requests are made, and the responses given do not identify the requests they answer. In the implementation of the protocol, the response is directed to a socket number that is unique on the requester's node. To write a decoder program for this situation, the MARK instruction is processed during the decode of the request. The string field of the MARK instruction contains the name of a string array that contains the names of all the possible response types. The value field tells which option is the correct one for this request. The source socket number is contained in a global variable, and this global is passed to the MARK instruction. In processing the MARK instruction, the name of the string array (the str#), the choice of which item to use in the array, the number of the packet, and the socket number passed in the global variable are stored together in LIFO memory, referred to here as the cache.
In writing a decode program for the response, the WHOA instruction is called at the point where different types of response are possible. The string field of the WHOA instruction contains the name of a string array that contains the names of all the possible response types. The value field tells which of these options is the default. The destination socket of the response has been saved in a global variable. In processing the WHOA instruction, the cache is searched for a match of both the socket number value contained in the global and the string array name. If a match is found, the choice stored in the cache by the MARK instruction is used to decode the next portion of the packet. Moreover, if the WHOA instruction contains the number of a global variable in the high 8 bits of the value field, that global is set to the number of the packet that contained the request. If there is a match in the cache, execution continues with the designated decoder. If there is no match and the user makes a choice from those presented, execution continues with the corresponding chosen decoder. In all other cases, including when cancel is chosen, execution continues with the instruction following the WHOA instruction. Please see the example below:
Mark and Whoa code example

1) Open decodes.dcd
2) Search for void SNMP

    void SNMP;
    ASN1_Init;
    MOVE 0 11; *enum testing is off
    LABL 0 0 0 b1 SNMP - Simple Network Management Protocol;
    WHOA 0 1 0 0 SNMP Exp Opt;
    WHOA 0 2 0 0 SNMP Exp Opt;
    SEQU a1 1;
    SKIP 5 0;
    SEQU a2 1;
    SKIP 3 0;
    MARK 2 2 0 0 SNMP Exp Opt;
    WHOA 0 2 0 0 SNMP Exp Opt;
    TNXT -Summary SNMP Fields;
    MARK 2 1 0 0 SNMP Exp Opt;
    WHOA 0 1 0 0 SNMP Exp Opt;
    TNXT -Summary SNMP Fields;

3) Notice the MARK on line 11. It translates to: mark this packet as a possible help for a following packet, and use the string array SNMP Exp Opt to figure out the next possible decode. The value field, which is 2, tells which string to use: if the value in the global is matched later, the decode specified in the string array at position 2 is used. The mark is then stored in a LIFO (Last In First Out) cache.
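The interplay between MARK and WHOA can be pictured with a short Python model of the cache. This is a sketch of the behaviour as described in the text, not the actual implementation: the class and method names are hypothetical, and it assumes that the newest matching mark wins and that a mark stays in the cache after it has been matched.

```python
class MarkCache:
    """Hypothetical model of the LIFO cache shared by MARK and WHOA."""

    def __init__(self):
        self._entries = []  # oldest first, newest last

    def mark(self, array_name, choice, key, packet_number):
        """MARK: remember which response decoder a request expects."""
        self._entries.append((array_name, key, choice, packet_number))

    def whoa(self, array_name, key):
        """WHOA: look for a mark matching both the array name and the key.

        Returns (choice, request_packet_number), or None when the user
        would have to be prompted instead.
        """
        for name, stored_key, choice, packet_number in reversed(self._entries):
            if name == array_name and stored_key == key:
                return choice, packet_number
        return None


cache = MarkCache()
# Request seen in packet 7: expect entry 2 of "SNMP Exp Opt" on key 0x1234.
cache.mark("SNMP Exp Opt", choice=2, key=0x1234, packet_number=7)
matched = cache.whoa("SNMP Exp Opt", 0x1234)    # found: no prompt needed
unmatched = cache.whoa("SNMP Exp Opt", 0x9999)  # not found: prompt the user
```

Here `key` stands for the value held in the global passed to both instructions, such as the socket number in the earlier discussion.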
Decoder programming is much like assembly language programming with very long assembly instructions. The following examples serve to illustrate how we have programmed the decoding of some of the protocols.
This is a typical, simple decode instruction:
HWRD 0 2 90 c2 Checksum:;
This instruction translates to: decode the next 16 bits as hexadecimal, the high order bit of value is zero so prepend a "0x" (ignore remaining bits in value), store the 16 bits in global variable number 2, and format the display as: new line, label at column 2, in data label style, with the given string, data in plain style beginning at column 24.
Here is another example:
HBYT 0 1 90 c2 AppleTalk Type:; CEQU 1 1 0 14 Short DDP; TTST 0 0 0 0 Short DDP; CEQU 2 1 0 14 Long DDP; TTST 0 0 0 0 Long DDP;
These instructions decode the byte that contains the LAP type in an AppleTalk DDP packet. After displaying the value in hexadecimal, a translation string of "Short DDP" or "Long DDP" is displayed in the message style if the LAP type is equal to 1 or 2 respectively. Further, if the type is either 1 or 2, decode execution switches to a decoder named "Short DDP" or "Long DDP" respectively.
The best source of examples is the decoders we've provided with the product. If you're using Microsoft Windows you can view the decoders with a text editor such as Notepad or WordPad. If you're using MacOS you can view the decoders with a text editor such as BBEdit.
Remember, your decoder and string arrays should be uniquely named over the set of all decoder files. If you fail to do this, it is not a fatal problem, but it can be confusing because, most likely, only the first occurrence of an item will be found. Many protocol stacks have names or acronyms that are exactly the same. Choose your decoder names judiciously. Also, old decoder files must be moved out of the directory tree of the application.
Inevitably, you will need to link your decoders into the parsing stream of the decoder mechanism by changing at least one of the existing decoder files. Here are some hints for commonly used control points.
1. Make a copy of the "decodes.dcd" file as a backup and move it to a
safe place.
2. Open the original file.
3. Locate the last instruction in the "LSAP::Names" decoder which is a
"TRTS".
4. Add a new "CEQU" instruction immediately before the last "TRTS"
instruction.
5. Edit the instruction value to be the hexadecimal value of the new LSAP.
6. Edit the instruction string to be the name of the new 802.2 type.
7. Base the rest of the instruction on preceding "CEQU" instructions.
The next steps add the jump point to the new LSAP type. Usually there are two
possibilities: two jump points, (one for the DSAP and one for the SSAP) or one
jump point which requires that both the DSAP and SSAP have the same new value.
8. Locate the decoder called "802_2::Common".
9. Go to the bottom of the decode "802_2::Common". You will see a
"802_2::Data"; this is a branch instruction similar to:
    TSUB 0 0 0 0 802.2 Data;
or
    TSUB -802.2 Data;
Follow the branch to the decode "802_2::Data".
1. Make a copy of the "decodes.dcd" file as a backup and move it to a
safe place.
2. Open the original file.
3. Locate the decoder called "SNAP::Names".
4. Locate the last instruction in the "SNAP::Names" decoder which is a
"TRTS".
5. Add a new "CEQU" instruction before the "TRTS".
6. Edit the instruction value to be the hexadecimal value of the new SNAP.
7. Edit the instruction string to be the name of the new SNAP type.
8. Base the rest of the instruction on preceding "CEQU" instructions.
9. Locate the decoder called "SNAP".
10. Locate the last "TEQU" instruction.
11. Add a new "TEQU" instruction after the following instruction:
    TEQU 80c5 2 0 0 VINES Echo;
1. Make a copy of the "decodes.dcd" file as a backup and move it to a
safe place.
2. Open the original file.
3. Locate the decoder called "IP Common".
4. Add a new "TEQU" instruction at the top.
5. Edit the instruction value to be the hexadecimal value of the new IP type.
6. Edit the instruction string to be the name of the new decoder that handles
the IP type.
7. Save and close the file.
1. Make a copy of the "decodes.dcd" file as a backup and move it to a
safe place.
2. Open the original file.
3. Locate the decoder called "TCP::Ports_Str".
4. Add a new entry that will test the source port number stored in global 2.
Place it above the first entry, which should be:
    0x7 | Echo;
If your decoder work has value to other users of our products, we would be happy to distribute your decoders along with our own, giving you credit for your work.
In every decode instruction the data format is one byte. It can be divided into two 4-bit sections: the high 4 bits signify the columnar location to display the data (the data location bits) and the low 4 bits signify the style (color, etc.) in which the data should be displayed (the data style bits).
Bits | Field |
---|---|
7-4 | Format (data location) |
3-0 | Style (data style) |

Format value (bits 7-4) | Meaning |
---|---|
0x0 | Display data immediately with no spaces. |
0x1 | Display data after a carriage return. |
0x2 | Skip 2 spaces, then display data. |
0x3 | Skip 4 spaces, then display data. |
0x4 | Display data beginning in column 4. |
0x5 | Display data beginning in column 8. |
0x6 | Display data beginning in column 12. |
0x7 | Display data beginning in column 16. |
0x8 | Display data beginning in column 20. |
0x9 | Display data beginning in column 24. |
0xA | Display data beginning in column 28. |
0xB | Display data beginning in column 32. |
0xC | Display data beginning in column 36. |
0xD | Display data beginning in column 40. |
0xE | Display data beginning in column 44. |
0xF | Display data beginning in column 48. |
Style value (bits 3-0) | Meaning |
---|---|
0x0 | Plain data. |
0x1 | Layer label. |
0x2 | Data label. |
0x3 | Header label (for use with PRVx instructions). |
0x4 | Message. |
0x5 | Invisible. |
0x6 | Data dump label. |
0x7 | Indent. |
0x8 | Unindent. |
0x9 | Unused. |
0xA | Unused. |
0xB | Unused. |
0xC | Unused. |
0xD | Unused. |
0xE | Unused. |
0xF | Unused. |
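The two tables above can be applied mechanically to any data format byte. The Python sketch below does so for illustration; the function name and the wording of the returned descriptions are hypothetical, while the bit positions and meanings come from the tables.

```python
DATA_STYLES = [
    "Plain data", "Layer label", "Data label", "Header label",
    "Message", "Invisible", "Data dump label", "Indent", "Unindent",
] + ["Unused"] * 7


def describe_data_format(byte: int) -> tuple[str, str]:
    """Return (location, style) descriptions for a data format byte."""
    location = (byte >> 4) & 0xF   # high 4 bits: data location
    style = byte & 0xF             # low 4 bits: data style
    if location == 0x0:
        where = "immediately, no spaces"
    elif location == 0x1:
        where = "after a carriage return"
    elif location == 0x2:
        where = "after 2 spaces"
    elif location == 0x3:
        where = "after 4 spaces"
    else:
        # 0x4 is column 4, and each further step adds 4 columns.
        where = f"column {4 * (location - 3)}"
    return where, DATA_STYLES[style]


checksum_data = describe_data_format(0x90)
```

For the data format 90 used in the Checksum example, this yields plain data beginning at column 24.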
In every decode instruction the label format is one byte. The label format byte can be divided into one 1-bit field (the highest bit), which tells whether a new line should precede the label (the newline bit); one 3-bit field, which tells the columnar location to display the label (the label location bits); and one 4-bit field (the lowest bits), which tells the style (color, etc.) in which the label should be displayed (the label style bits).

Bits | Field |
---|---|
7 | Line (newline bit) |
6-4 | Format (label location) |
3-0 | Style (label style) |

Line value (bit 7) | Meaning |
---|---|
0x0 | No new line before label. |
0x1 | New line before label. |
Format value (bits 6-4) | Meaning |
---|---|
0x0 | Display label immediately with no spaces. |
0x1 | Skip 2 spaces, then display label. |
0x2 | Skip 4 spaces, then display label. |
0x3 | Display label beginning in column 0. |
0x4 | Display label beginning in column 2. |
0x5 | Display label beginning in column 24. |
0x6 | Display label beginning in column 32. |
0x7 | Display label beginning in column 36. |
Style value (bits 3-0) | Meaning |
---|---|
0x0 | Plain. |
0x1 | Layer label. |
0x2 | Data label. |
0x3 | Header label (for use with PRVx instructions). |
0x4 | Message. |
0x5 | Invisible. |
0x6 | Unused. |
0x7 | Indent. |
0x8 | Unindent. |
0x9 | Unused. |
0xA | Unused. |
0xB | Unused. |
0xC | Unused. |
0xD | Unused. |
0xE | Unused. |
0xF | Unused. |
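A label format byte can be taken apart in the same way as the tables above describe. The following Python is an illustrative sketch; the function name and the returned wording are hypothetical, while the bit positions and meanings are those given in the tables.

```python
LABEL_LOCATIONS = [
    "immediately, no spaces", "after 2 spaces", "after 4 spaces",
    "column 0", "column 2", "column 24", "column 32", "column 36",
]
LABEL_STYLES = [
    "Plain", "Layer label", "Data label", "Header label",
    "Message", "Invisible", "Unused", "Indent", "Unindent",
] + ["Unused"] * 7


def describe_label_format(byte: int) -> tuple[bool, str, str]:
    """Return (new line first?, location, style) for a label format byte."""
    newline = bool(byte & 0x80)                   # bit 7: newline bit
    location = LABEL_LOCATIONS[(byte >> 4) & 0x7]  # bits 6-4: label location
    style = LABEL_STYLES[byte & 0xF]               # bits 3-0: label style
    return newline, location, style


checksum_label = describe_label_format(0xC2)
layer_label = describe_label_format(0xB1)
```

The label format c2 from the Checksum example comes out as a new line, column 2, data label style, and b1 from the FTP example as a new line, column 0, layer label style.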
Letter | Instructions |
---|---|
A | ADDG, ANDG, AT01, AT03, ATLG |
B | BBIT, BBYT, BEEP, BGLB, BITO, BLNG, BREM, BST#, BWRD, BYTO |
C | CBIT, CEQU, CGTE, CHR#, CKSM, CLSE, CLTE, CNEQ, CRLF, CST#, CSTR |
D | D64B, DBIT, DBRK, DBYT, DECR, DGLB, DIVG, DLNG, DMPE, DUMP, DWRD |
E | EBC#, ETHR |
F | FCSC, FTPL |
G | GBIT, GBYT, GLNG, GSTR, GWRD |
H | HBIT, HBYT, HEX#, HGLB, HLNG, HWRD |
I | INCR, IPLG, IPV6 |
L | LABL, LSTE, LSTS, LSTZ |
M | MARK, MODG, MOVE, MULG |
N | NBNM, NOTG |
O | ORRG, OSTP |
P | POPX, PORT, PRTO, PSTR, PUSH |
S | SBIT, SCMP, SEQU, SGTE, SGTX, SHFL, SHFR, SKIP, SLTE, SLTX, SNEQ, SUBG |
T | TBIT, TEQU, TGTE, TGTX, TIME, TLSE, TLTE, TLTX, TNEQ, TNXT, TRAK, TRNG, TRTS, TSB#, TST#, TSUB, TTST, TYPE |
W | WHOA |
X | XBIT, XEQU, XNEQ, XGTE, XGTX, XLSE, XLTE, XLTX, XNEQ, XTST |

WildPackets, Inc.
1340 Treat Blvd., Suite 500
Walnut Creek, CA 94597 USA
925-937-3200
http://www.wildpackets.com/
sdkhelp@wildpackets.com
Copyright © 1991-2003 WildPackets, Inc.
All rights reserved.